adaptation strategy
Cola: A Benchmark for Compositional Text-to-image Retrieval
Compositional reasoning is a hallmark of human visual intelligence. Yet, despite the size of large vision-language models, they struggle to represent simple compositions by combining objects with their attributes. To measure this lack of compositional capability, we design Cola, a text-to-image retrieval benchmark to Compose Objects Localized with Attributes. To solve Cola, a model must retrieve images with the correct configuration of attributes and objects and avoid choosing a distractor image with the same objects and attributes but in the wrong configuration. Cola contains about 1.2k composed queries of 168 objects and 197 attributes on around 30K images.
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
Diaz-Bone, Leander, Bagatella, Marco, Hübotter, Jonas, Krause, Andreas
Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these challenges is key for building self-improving agents with superhuman ability. Prior work commonly explores with the objective of solving many sparse-reward tasks, making exploration of individual high-dimensional, long-horizon tasks intractable. We argue that solving such challenging tasks requires solving simpler tasks that are relevant to the target task, i.e., whose achieval will teach the agent skills required for solving the target task. We demonstrate that this sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without leveraging any prior information. To this end, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent's initial distance to the target, but independent of the volume of the space of all tasks. We then perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.
- North America > Canada > Alberta (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Germany (0.04)
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.64)
- Leisure & Entertainment > Games (1.00)
- Education (0.67)
Domain-Specific Data Generation Framework for RAG Adaptation
Tian, Chris Xing, Xie, Weihao, Chen, Zhen, Yi, Zhengyuan, Liu, Hui, Li, Haoliang, Wang, Shiqi, Ma, Siwei
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models (LLMs) with external retrieval to enable domain-grounded responses. Effectively adapting RAG systems to domain-specific settings requires specialized, context-rich training data beyond general-purpose question-answering. Here, we propose RAGen, a scalable and modular framework for generating domain-grounded question-answer-context (QAC) triples tailored to diverse RAG adaptation approaches. RAGen produces these QAC triples by identifying key concepts in documents, generating diverse questions guided by Bloom's Taxonomy-inspired principles, and pairing them with precise answers extracted from relevant contexts. RAGen supports multiple RAG adaptation strategies, including the optimization of key components such as the LLM, retriever, and embedding model, etc. Its modular pipeline features semantic chunking, hierarchical concept extraction, and multi-chunk retrieval, along with the introduction of curated distractor contexts to promote robust reasoning. Designed for scalability, RAGen efficiently handles large and evolving document corpora without redundant processing, making it especially suitable for dynamic evolving domains such as scientific research and enterprise knowledge bases.
Design Process of a Self Adaptive Smart Serious Games Ecosystem
Tao, X., Chen, P., Tsami, M., Khayati, F., Eckert, M.
Abstract--This paper outlines the design vision and planned evolution of Blexer v3, a modular and AI-driven rehabilitation ecosystem based on serious games. Building on insights from previous versions of the system, we propose a new architecture that aims to integrate multimodal sensing, real-time reasoning, and intelligent control. The envisioned system will include distinct modules for data collection, user state inference, and gameplay adaptation. Key features such as dynamic difficulty adjustment (DDA) and procedural content generation (PCG) are also considered to support personalized interventions. We present the complete conceptual framework of Blexer v3, which defines the modular structure and data flow of the system. This serves as the foundation for the next phase: the development of a functional prototype and its integration into clinical rehabilitation scenarios. Video games have evolved significantly since their inception in the 1960s, becoming a cultural force in the late 1980s and early 1990s [1]. With the growth of the videogame industry, games have expanded into fields such as education, military, and healthcare, known as Serious Games (SGs) [2]. In healthcare, SGs have shown promise in screening [3] and rehabilitation [4].
- Europe > Spain > Galicia > Madrid (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
- Europe > United Kingdom > England > East Sussex (0.04)
- (2 more...)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.95)
- Health & Medicine > Therapeutic Area > Neurology (0.93)
HeDA: An Intelligent Agent System for Heatwave Risk Discovery through Automated Knowledge Graph Construction and Multi-layer Risk Propagation Analysis
Wang, Yiquan, Huang, Tin-Yeh, Gao, Qingyun, Zhang, Jialin
Heatwaves pose complex cascading risks across interconnected climate, social, and economic systems, but knowledge fragmentation in scientific literature hinders comprehensive understanding of these risk pathways. We introduce HeDA (Heatwave Discovery Agent), an intelligent multi-agent system designed for automated scientific discovery through knowledge graph construction and multi-layer risk propagation analysis. HeDA processes over 10,247 academic papers to construct a comprehensive knowledge graph with 23,156 nodes and 89,472 relationships, employing novel multi-layer risk propagation analysis to systematically identify overlooked risk transmission pathways. Our system achieves 78.9% accuracy on complex question-answering tasks, outperforming state-of-the-art baselines including GPT-4 by 13.7%. Critically, HeDA successfully discovered five previously unidentified high-impact risk chains, such as the pathway where a heatwave leads to a water demand surge, resulting in industrial water restrictions and ultimately causing small business disruption, which were validated through historical case studies and domain expert review. This work presents a new paradigm for AI-driven scientific discovery, providing actionable insights for developing more resilient climate adaptation strategies.
- North America > United States > California (0.04)
- Asia > China > Xinjiang Uygur Autonomous Region (0.04)
- Asia > China > Hong Kong (0.04)
- (5 more...)
- Health & Medicine (0.94)
- Banking & Finance > Economy (0.88)
- Law (0.87)
- Water & Waste Management > Water Management > Water Supplies & Services (0.55)
Avoiding Over-Personalization with Rule-Guided Knowledge Graph Adaptation for LLM Recommendations
Spadea, Fernando, Seneviratne, Oshani
We present a lightweight neuro-symbolic framework to mitigate over-personalization in LLM-based recommender systems by adapting user-side Knowledge Graphs (KGs) at inference time. Instead of retraining models or relying on opaque heuristics, our method restructures a user's Personalized Knowledge Graph (PKG) to suppress feature co-occurrence patterns that reinforce Personalized Information Environments (PIEs), i.e., algorithmically induced filter bubbles that constrain content diversity. These adapted PKGs are used to construct structured prompts that steer the language model toward more diverse, Out-PIE recommendations while preserving topical relevance. We introduce a family of symbolic adaptation strategies, including soft reweighting, hard inversion, and targeted removal of biased triples, and a client-side learning algorithm that optimizes their application per user. Experiments on a recipe recommendation benchmark show that personalized PKG adaptations significantly increase content novelty while maintaining recommendation quality, outperforming global adaptation and naive prompt-based methods.
- North America > United States > New York > Rensselaer County > Troy (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Japan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.84)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.70)
GSTBench: A Benchmark Study on the Transferability of Graph Self-Supervised Learning
Song, Yu, Hua, Zhigang, Xie, Yan, Liu, Jingzhe, Long, Bo, Liu, Hui
Self-supervised learning (SSL) has shown great promise in graph representation learning. However, most existing graph SSL methods are developed and evaluated under a single-dataset setting, leaving their cross-dataset transferability largely unexplored and limiting their ability to leverage knowledge transfer and large-scale pretraining, factors that are critical for developing generalized intelligence beyond fitting training data. To address this gap and advance foundation model research for graphs, we present GSTBench, the first systematic benchmark for evaluating the transferability of graph SSL methods. We conduct large-scale pretraining on ogbn-papers100M and evaluate five representative SSL methods across a diverse set of target graphs. Our standardized experimental setup decouples confounding factors such as model architecture, dataset characteristics, and adaptation protocols, enabling rigorous comparisons focused solely on pretraining objectives. Surprisingly, we observe that most graph SSL methods struggle to generalize, with some performing worse than random initialization. In contrast, GraphMAE, a masked autoencoder approach, consistently improves transfer performance. We analyze the underlying factors that drive these differences and offer insights to guide future research on transferable graph SSL, laying a solid foundation for the "pretrain-then-transfer" paradigm in graph learning. Our code is available at https://github.com/SongYYYY/GSTBench.
- Asia > South Korea > Seoul > Seoul (0.05)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- (5 more...)
RTTC: Reward-Guided Collaborative Test-Time Compute
Test-Time Compute (TTC) has emerged as a powerful paradigm for enhancing the performance of Large Language Models (LLMs) at inference, leveraging strategies such as Test-Time Training (TTT) and Retrieval-Augmented Generation (RAG). However, the optimal adaptation strategy varies across queries, and indiscriminate application of TTC strategy incurs substantial computational overhead. In this work, we introduce Reward-Guided Test-Time Compute (RTTC), a novel framework that adaptively selects the most effective TTC strategy for each query via a pretrained reward model, maximizing downstream accuracy across diverse domains and tasks. RTTC operates in a distributed server-client architecture, retrieving relevant samples from a remote knowledge base and applying RAG or lightweight fine-tuning on client devices only when necessary. To further mitigate redundant computation, we propose Query-State Caching, which enables the efficient reuse of historical query states at both retrieval and adaptation levels. Extensive experiments across multiple LLMs and benchmarks demonstrate that RTTC consistently achieves superior accuracy compared to vanilla RAG or TTT, validating the necessity of adaptive, reward-guided TTC selection and the potential of RTTC for scalable, high-performance language model adaptation.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Beijing > Beijing (0.04)